A comparative study of speaker adaptation techniques

نویسندگان

  • Leonardo Neumeyer
  • Ananth Sankar
  • Vassilios Digalakis
چکیده

In previous work, we showed how to constrain the estimation of continuous mixture-density hidden Markov models (HMMs) when the amount of adaptation data is small. We used maximum-likelihood (ML) transformation-based approaches and Bayesian techniques to achieve near native performance when testing nonnative speakers of the recognizer language. In this paper, we study various ML-based techniques and compare experimental results on data sets with recordings from nonnative and native speakers of American English. We divide the transformation-based techniques into two groups. In feature space techniques, we hypothesize an underlying transformation in the feature-space that results in a transformation of the HMM parameters. In model-space techniques, we hypothesize a direct transformation of the HMM parameters. In the experimental section we show how the combination of the best ML and Bayesian adaptation techniques result in significant improvements in recognition accuracy. All the experiments were carried out with SRI's DECIPHER TM speech recognition system [1][2].

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

A Comparative Study of Several Incremental Adaptation Algorithms for Speaker Adaptation

We conduct a comparative study of five representative incremental HMM adaptation algorithms developed in the past few years. We report the experimental results of using these algorithms for on-line speaker adaptation in a continuous Mandarin Chinese speech recognition system. We identify the strength and weakness of individual algorithms and offer recommendations for practitioners to make intel...

متن کامل

A comparative study of adaptation methods for speaker verification

Real-life speaker verification systems are often implemented using client model adaptation methods, since the amount of data available for each client is often too low to consider plain Maximum Likelihood methods. While the Bayesian Maximum A Posteriori (MAP) adaptation method is commonly used in speaker verification, other methods have proven to be successful in related domains such as speech ...

متن کامل

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

متن کامل

Dependence of GMM adaptation on feature post-processing for speaker recognition

This paper presents a study on the relationship between feature post-processing and speaker modelling techniques for robust text-independent speaker recognition. A fully coupled target and background Gaussian mixture speaker model structure is used for hypothesis testing in this speaker model based recognition system. Two formulations of the Maximum a Posteriori (MAP) adaptation algorithm for G...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1995